METHOD AND SYSTEM FOR CORRECTING PROJECTIVE DISTORTIONS WITH ELIMINATION STEPS AT MULTIPLE LEVELS
Patent abstract:
A method comprising the steps of image binarization, connected component analysis, determination of a horizontal vanishing point, determination of a vertical vanishing point and projective correction. The horizontal vanishing point is determined by estimating baselines of the text using position-determining pixels of the pixel blobs, identifying vanishing point candidates from the baselines, and determining a horizontal vanishing point from the candidates. The vertical vanishing point is determined based on vertical characteristics of the text portion. The determination of the horizontal vanishing point includes a first elimination step at the characteristic points, a second elimination step at the text baselines and a third elimination step at the candidate horizontal vanishing points. Publication number: BE1022636B1 Application number: E2014/5137 Filing date: 2014-12-19 Publication date: 2016-06-22 Inventors: Jianglin Ma; Michel Dauw; Olivier Dupont; Pierre De Muelenaere Applicant: I.R.I.S. IPC main class:
Patent description:
"Method and method for the correction of projective distortions with elimination steps at multiple levels" Technical area The present invention relates to a method, a system, a device and a computer program product for correcting projective distortion. Prior art Digital cameras (hereinafter referred to as cameras) can be used to capture images. With advances in technology, digital cameras are being implemented in almost all types of digital devices. Examples of such digital devices include, but are not limited to, a mobile communication device, a tablet, a laptop and a digital personal assistant (PDA). In many cases, cameras can serve as an alternative for a document scanning scanner, since cameras can be used to capture images from a document. Images in the document may need to be processed before text recognition and / or text extraction. The image processing of the document imposes two major challenges: the poor quality of captured images due to unfavorable imaging conditions and the distortion in your captured images. The distortion may be due to the camera and / or an angle and camera positions relative to the document plane while capturing images. The distortion due to the last-mentioned point is known as projective distortion. In projective distortion, symptoms or text characters appear larger when they are closer to the camera plane and appear to shrink when they are farther away. There are known techniques to improve your image quality. However, improving the quality of the images does not help the recognition and / or extraction of text when the images of the documents are, in particular, deformed by projection. Projective distortion not only disturbs the visual interpretation of text but also affects the accuracy of text recognition algorithms. There are techniques for correcting projective distortion. One of the currently known techniques for performing projective distortion correction uses auxiliary data. 
The auxiliary data may include a combination of orientation measurement data, accelerometer data, and distance measurement data. However, such auxiliary data may not be available in all electronic devices due to the lack of various sensors and/or lack of processing capabilities. Some other techniques discuss manual correction of projective distortion. Such a technique requires a user to manually identify and mark the four corners of a quadrilateral that was, before the distortion, usually a rectangle formed by two horizontal straight segments and two vertical straight segments. Another technique requires the user to identify and mark parallel lines that corresponded to horizontal lines or vertical lines prior to the distortion. Correction of projective distortion is then performed based on the marked corners or parallel lines. However, manual correction of projective distortion takes time, is inefficient and is prone to errors. There are also techniques for automatic correction of projective distortions. These techniques focus on the identification of horizontal and vertical vanishing points. The vanishing points refer to points where outlines (e.g., horizontal outlines or vertical outlines) of the document in the image converge to a point. The techniques use the horizontal and vertical vanishing points to correct projective distortion. However, most techniques require complicated manual settings for the correction. If the image content changes, the settings must be changed manually. This limits the applicability of the techniques. In addition, the existing techniques are computationally expensive, which makes their implementation difficult on small devices such as mobile communication devices. In addition, most techniques operate on the assumption that the images of the document include only text. In the case where the images of the document include a combination of text and photos, the techniques may produce no useful results, or no results at all.
Many techniques also work on the assumption that the text in the images of the document is formatted and/or positioned in a particular way. So when the text in the images is not formatted and/or positioned in that particular way, the techniques fail. Disclosure of the invention It is an object of this invention to provide a method, a system, a device and/or a computer program product for performing a projective correction of a distorted image, which do not exhibit at least one of the disadvantages mentioned above. This object is achieved according to the invention as defined in the independent claims. According to a first aspect of the present invention, which can be combined with the other aspects described herein, disclosure is made of a method of projectively correcting an image containing at least a text portion that is distorted by the perspective. The method comprises an image binarization step where said image is binarized. Then, the method includes a step of performing a connected component analysis. The connected component analysis involves detecting blobs of pixels in said at least one text portion of said binarized image. Then, the method comprises a step of determining the horizontal vanishing points. Determining the horizontal vanishing points includes estimating text baselines by means of characteristic points of said pixel blobs and determining a horizontal vanishing point of said at least one text portion using said text baselines. The method further includes a step of determining vertical vanishing points for said at least one text portion based on vertical characteristics thereof. The method further includes a projective correction step, which includes correcting said perspective in said image based on said horizontal and vertical vanishing points.
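The connected component analysis step can be illustrated with a minimal sketch (not the patented implementation; the function name and the choice of 4-connectivity are assumptions of this sketch): flood-fill labelling of a binarized image, returning one bounding box per pixel blob.

```python
from collections import deque

def find_blobs(binary):
    """Return one bounding box (min_row, min_col, max_row, max_col) per
    4-connected foreground component (pixel value 1) of a binarized image,
    given as a list of rows."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] == 1 and not seen[r][c]:
                # breadth-first flood fill over this blob
                q = deque([(r, c)])
                seen[r][c] = True
                r0, c0, r1, c1 = r, c, r, c
                while q:
                    y, x = q.popleft()
                    r0, r1 = min(r0, y), max(r1, y)
                    c0, c1 = min(c0, x), max(c1, x)
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < rows and 0 <= nx < cols \
                                and binary[ny][nx] == 1 and not seen[ny][nx]:
                            seen[ny][nx] = True
                            q.append((ny, nx))
                boxes.append((r0, c0, r1, c1))
    return boxes
```

The bounding boxes produced here are the per-blob geometry from which characteristic points and blob statistics can subsequently be derived.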
In embodiments according to the first aspect, a step of separating text and photos is performed after said image binarization and before said connected component analysis, and only the textual information is kept. In embodiments according to the first aspect, each characteristic point may be the bottom center of a bounding box of the pixel blob concerned. The text baseline estimation step may include a step of eliminating confusing characteristic points. Confusing characteristic points may be detected as points that are misaligned with respect to the characteristic points in the vicinity of the point under consideration. Confusing characteristic points may be neglected for said text baseline estimation. In embodiments according to the first aspect, the step of removing confusing characteristic points may include determining the width and height of the pixel blobs, determining average values for the width and height of the pixel blobs, and detecting said confusing characteristic points as characteristic points belonging to pixel blobs of which at least one of the width and the height differs by a predetermined measure from said calculated average values. In embodiments according to the first aspect, said text baseline estimation step may include a step of clustering characteristic points into groups of characteristic points. Said groups of characteristic points may fulfill at least one of the following conditions: - a point-to-point distance between the characteristic points of the group is less than a first distance threshold, - a point-to-line distance between each characteristic point of the group and a line formed by the characteristic points of the group is less than a second distance threshold, - an angle of the line formed by the characteristic points of the group to the horizontal is less than a maximum angle, and - the group of characteristic points contains a minimum number of characteristic points.
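The four clustering conditions above can be sketched as a single validity check on a candidate group of characteristic points. The thresholds, the simple least-squares line fit and all names are illustrative assumptions of this sketch, not taken from the text:

```python
import math

def fit_line(points):
    """Least-squares fit of y = a*x + b through (x, y) points.
    Assumes the points are not vertically stacked (distinct x values)."""
    n = len(points)
    sx = sum(p[0] for p in points)
    sy = sum(p[1] for p in points)
    sxx = sum(p[0] ** 2 for p in points)
    sxy = sum(p[0] * p[1] for p in points)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

def group_is_valid(points, d_max, line_d_max, angle_max_deg, min_points):
    """Check the four clustering conditions for a candidate group of
    characteristic points, each given as (x, y)."""
    if len(points) < min_points:            # minimum group size
        return False
    pts = sorted(points)                    # left-to-right order
    for p, q in zip(pts, pts[1:]):          # consecutive point-to-point distances
        if math.dist(p, q) >= d_max:
            return False
    a, b = fit_line(pts)
    if abs(math.degrees(math.atan(a))) >= angle_max_deg:  # angle to horizontal
        return False
    norm = math.hypot(a, 1.0)               # point-to-line distances
    return all(abs(a * x - y + b) / norm < line_d_max for x, y in pts)
```

As noted below, the thresholds would in practice be set adaptively from the image content rather than fixed.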
Said text baselines can be estimated based on said groups of characteristic points. In embodiments according to the first aspect, said first distance threshold, said second distance threshold, said maximum angle and said minimum number of characteristic points can be set adaptively based on the content of the image. Said text baseline estimation step may further include a step of merging groups of characteristic points, in which groups of characteristic points on both sides of a neglected characteristic point are merged into a larger group of characteristic points. In embodiments according to the first aspect, said step of determining the horizontal vanishing point may comprise the steps of defining each of said estimated text baselines as a line in a Cartesian coordinate system, transforming each of said text baselines defined in the Cartesian coordinate system into a data point in a homogeneous coordinate system, and assigning a confidence level to each of the data points. Said confidence level may be based on at least the length of the text baseline concerned and the proximity between the group of characteristic points used to estimate the text baseline and the resulting text baseline. In embodiments according to the first aspect, said step of determining the horizontal vanishing point further comprises the steps of grouping a number of data points having a confidence level above a predetermined threshold in a priority sample matrix, clustering the data points in the priority sample matrix into a number of sample groups, assigning a group confidence value to each sample group on the basis of at least the confidence level assigned to each data point in the sample group, and iteratively selecting sample groups of data points in the priority sample matrix for line fitting purposes. Each sample group may comprise two or more data points. Said iteration can begin with the sample group having the highest confidence value in the priority sample matrix.
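A minimal sketch of building and ordering the priority sample matrix (here simply a sorted list): the pairing of consecutive points into sample groups and the use of the sum of member confidences as the group confidence value are assumptions of this sketch, not prescribed by the text.

```python
def priority_samples(data_points, conf, threshold, group_size=2):
    """Keep data points whose confidence exceeds `threshold`, cluster them
    into consecutive sample groups of `group_size`, and order the groups by
    a group confidence value (here: the sum of member confidences), highest
    first. Line fitting would then try the groups in this order."""
    kept = [p for p, c in zip(data_points, conf) if c > threshold]
    kept_conf = [c for c in conf if c > threshold]
    groups = []
    for i in range(0, len(kept) - group_size + 1, group_size):
        members = kept[i:i + group_size]
        groups.append((sum(kept_conf[i:i + group_size]), members))
    groups.sort(key=lambda g: g[0], reverse=True)  # highest confidence first
    return groups
```

Starting the iteration from the highest-confidence group is what lets the subsequent line fitting converge quickly on a plausible candidate.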
In embodiments according to the first aspect, said step of determining the horizontal vanishing point may comprise the steps of performing the line fitting for the first sample group, resulting in a first fitted line, then performing the line fitting for each other sample group, resulting in other fitted lines, determining, on the basis of the first and the other fitted lines, a set of data points which are positioned below a predetermined distance threshold from the first fitted line, and estimating at least a first and a second horizontal vanishing point candidate from the horizontal text baselines corresponding to the determined set of data points. In embodiments according to the first aspect, said step of determining the horizontal vanishing point may comprise the steps of performing a projective correction based on each estimated horizontal vanishing point candidate, comparing the proximity of each horizontal vanishing point candidate to the resulting horizontal text direction after projective correction, and selecting the horizontal vanishing point candidate that is closest to the horizontal text direction of the image document after projective correction. In embodiments according to the first aspect, said step of determining the vertical vanishing point may comprise the steps of estimating a plurality of vertical text lines, each corresponding to the direction of one of said pixel blobs selected by a blob filtering algorithm on the text portion of the image, defining each of said estimated vertical text lines as a line in a Cartesian coordinate system, transforming each of said estimated vertical text lines in the Cartesian coordinate system into a data point in a homogeneous coordinate system, and assigning a confidence level to each of the data points. Said confidence level may be based on at least the eccentricity of the shape of the pixel blob used to estimate the vertical text line concerned.
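The eccentricity referred to here can, for instance, be computed from the second-order central moments of a blob. This moment-based formulation and all names are assumptions of this sketch, not prescribed by the text; it does reproduce the stated extremes (0 for a circular blob, 1 for a straight segment).

```python
import math

def blob_eccentricity(pixels):
    """Eccentricity of the equivalent ellipse of a blob, derived from its
    second-order central moments: 0 for a circular blob, approaching 1 for
    a straight segment. `pixels` is a list of (x, y) coordinates."""
    n = len(pixels)
    cx = sum(x for x, _ in pixels) / n
    cy = sum(y for _, y in pixels) / n
    mu20 = sum((x - cx) ** 2 for x, _ in pixels) / n
    mu02 = sum((y - cy) ** 2 for _, y in pixels) / n
    mu11 = sum((x - cx) * (y - cy) for x, y in pixels) / n
    # eigenvalues of the covariance matrix give the ellipse axes
    common = math.sqrt(((mu20 - mu02) / 2) ** 2 + mu11 ** 2)
    lam_max = (mu20 + mu02) / 2 + common
    lam_min = (mu20 + mu02) / 2 - common
    if lam_max == 0:
        return 0.0
    return math.sqrt(max(0.0, 1 - lam_min / lam_max))
```

Elongated blobs (tall characters such as "l" or "f") thus score close to 1 and are the most reliable carriers of the vertical text direction.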
In embodiments according to the first aspect, said step of determining the vertical vanishing point may comprise the steps of grouping a number of data points having a confidence level above a predetermined threshold in a priority sample matrix and clustering the data points in the priority sample matrix into a number of sample groups. Each sample group may comprise at least two data points. The step of determining the vertical vanishing point further includes the steps of assigning a group confidence value to each sample group based on the confidence level assigned to each data point in the sample group and iteratively selecting sample groups of data points in the priority sample matrix for line fitting purposes. Said iteration can begin with the sample group having the highest confidence value in the priority sample matrix. In embodiments according to the first aspect, said step of determining the vertical vanishing point may comprise the steps of performing the line fitting for the first sample group, resulting in a first fitted line, then performing the line fitting for each other sample group, resulting in other fitted lines, determining, on the basis of the first and the other fitted lines, a set of data points which are positioned below a predetermined distance threshold from the first fitted line, and estimating at least a first and a second vertical vanishing point candidate from the vertical text lines corresponding to the determined set of data points.
In embodiments according to the first aspect, said step of determining the vertical vanishing point may comprise the steps of performing a projective correction on the basis of each estimated vertical vanishing point candidate, comparing the proximity of each vertical vanishing point candidate to the resulting vertical text direction after projective correction, and selecting the vertical vanishing point candidate which is closest to the vertical text direction of the image document. In embodiments according to the first aspect, said blob filtering algorithm can select pixel blobs based on one or more of the following conditions: the eccentricity of the shape of the pixel blob considered, which represents the measure in which it is elongated (the value lies between 0 and 1, where 0 and 1 are the extremes: a blob whose eccentricity is 0 is in fact a circular object, while a blob whose eccentricity is 1 is a straight segment), is above a predetermined threshold, the proximity of each pixel blob to the edge of the image is above a predetermined distance threshold, the angle of the resulting vertical line to the vertical direction is less than a maximum angle threshold, and the area of each pixel blob, defined by the number of pixels, is less than a maximum area threshold but greater than a minimum area threshold. In embodiments according to the first aspect, said first and second vanishing point candidates can be estimated using different approximation methods selected from the group consisting of a least squares method, a weighted least squares method and an adaptive least squares method. According to a variant of the first aspect of the invention, which may be combined with the other aspects described herein, disclosure is made of a method of projectively correcting an image containing at least a text portion which is distorted by the perspective. The method includes an image binarization step where said image is binarized and a connected component analysis step.
The connected component analysis involves detecting blobs of pixels in said at least one text portion of said binarized image. For each of these pixel blobs, a position-determining pixel may be selected on a baseline of the pixel blob. The position-determining pixel can define the position of the pixel blob in the binarized image. The method further comprises a step of determining horizontal vanishing points. The determination of horizontal vanishing points includes estimating text baselines by means of characteristic points of said pixel blobs and determining a horizontal vanishing point of said at least one text portion by means of said text baselines. The method further includes determining vertical vanishing points. The vertical vanishing point is determined for the at least one text portion based on vertical characteristics thereof. The method further comprises a projective correction step, wherein said perspective distortion in said image is corrected on the basis of said horizontal and vertical vanishing points. In embodiments according to the variant of the first aspect, a step of separating text and photos is performed after said image binarization and before said connected component analysis, and only the textual information is kept. In embodiments of the variant of the first aspect, said position-determining pixel may be the bottom center of a bounding box of the pixel blob. The position-determining pixel may, in alternative embodiments, be a bottom corner (i.e., the lower left or lower right corner) of a bounding box of the pixel blob, or another pixel that determines the position of the pixel blob or of a rectangle enclosing it. In embodiments of the first aspect or the variant of the first aspect, there may be provided systems or devices comprising one or more processors and software code portions configured to perform the methods or steps described above.
In embodiments of the first aspect or the variant of the first aspect, there may be provided non-transitory storage media on which is stored a computer program product comprising software code portions in a format executable on a computing device and configured to perform the methods or steps described above when executed on said computing device. Said computing device may be any of the following: a personal computer, a laptop, a notebook, a mini-laptop, a tablet computer, a smartphone, a digital camera, a video camera, a mobile communication device, a personal digital assistant, a scanner, a multifunction device or any other similar computing device. In a second aspect according to the invention, which can be combined with the other aspects described herein, a method of determining candidate vanishing points of a text portion in an image document which is distorted by the perspective is described. The method comprises an image binarization step where said image is binarized. Then, the method comprises performing a connected component analysis, in which blobs of pixels are detected in said at least one text portion of said binarized image. A position-determining pixel is selected for each of the pixel blobs on a baseline of the pixel blob, said position-determining pixel defining the position of the pixel blob in the binarized image. The method also includes estimating a number of text lines in a Cartesian coordinate system, each text line representing an approximation of a horizontal or vertical text direction of said text portion, based on the position-determining pixels. The method also includes transforming each of said text lines into a data point in a homogeneous coordinate system. The method further includes assigning a confidence level to each of the data points. The method includes grouping a number of data points having a confidence level greater than a predetermined threshold into a priority sample matrix.
The method includes clustering the data points in the priority sample matrix into a number of sample groups. Each sample group may comprise two or more data points. The method further includes a step of assigning a group confidence value to each sample group based on at least the confidence level assigned to each data point in the sample group. In addition, the method includes applying a RANSAC algorithm to determine from among said data points a set of inliers (valid data) relative to a first fitted line. The RANSAC algorithm is started with the sample group with the highest group confidence value in the priority sample matrix. The method further comprises a step of estimating at least one vanishing point candidate from the text lines corresponding to said set of inliers. In embodiments according to the second aspect, a step of separating text and photos is performed after said image binarization and before said connected component analysis, and only the textual information is retained. In embodiments according to the second aspect, the confidence level that is assigned to said data points may be based on at least the length of the relevant text line and the proximity of the position-determining pixels to the relevant text line. In embodiments according to the second aspect, the RANSAC algorithm may comprise the following steps. First, iteratively selecting sample groups of data points in the priority sample matrix for line fitting. The iteration begins with the sample group with the highest group confidence value in the priority sample matrix. Then, performing the line fitting for the first sample group, resulting in a first fitted line, and performing the line fitting for each other sample group, resulting in other fitted lines.
Then, determining, on the basis of the first and the other fitted lines, a set of data points which are positioned below a predetermined distance threshold from the first fitted line, said set of data points forming said set of inliers. In embodiments according to the second aspect, the predetermined distance threshold from the first fitted line may be a fixed parameter. The predetermined distance threshold from the first fitted line may alternatively be adaptable based on the contents of the image document. In embodiments according to the second aspect, at least a first and a second vanishing point candidate can be estimated from the text lines corresponding to said set of inliers. The first and second vanishing point candidates can be estimated using different approximation methods selected from the group consisting of: a least squares method, a weighted least squares method, and an adaptive least squares method. The method may then further include a step of selecting a vanishing point among the estimated vanishing point candidates. The selection may comprise the steps of: performing a projective correction on the image document based on each estimated vanishing point candidate, comparing the proximity of each vanishing point candidate to the resulting horizontal or vertical text direction after projective correction, and selecting the vanishing point candidate that is closest to the horizontal or vertical text direction of the image document after projective correction. In embodiments according to the second aspect, the group confidence value for each sample group may further be based on the distances between the respective estimated text lines corresponding to the data points in the sample group. The confidence level of each of the data points may further be based on a dominant direction of the pixel blobs used to estimate each text line concerned. The dominant direction can be defined by the eccentricity of the shape of each pixel blob.
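As an unweighted stand-in for the least squares methods named above, a vanishing point candidate can be estimated as the point minimising the summed squared residuals to a set of near-concurrent lines. The function name and the simple y = a*x + b parameterisation are assumptions of this sketch, not taken from the text.

```python
def vanishing_point(lines):
    """Least-squares intersection of near-concurrent lines, each given as
    (a, b) for y = a*x + b. Minimises sum_i (a_i*x - y + b_i)^2 over (x, y)
    by solving the 2x2 normal equations."""
    n = len(lines)
    sa = sum(a for a, _ in lines)
    saa = sum(a * a for a, _ in lines)
    sab = sum(a * b for a, b in lines)
    sb = sum(b for _, b in lines)
    # normal equations: [saa -sa; sa -n] [x y]^T = [-sab; -sb]
    det = saa * (-n) + sa * sa
    x = (-sab * (-n) - (-sa) * (-sb)) / det
    y = (saa * (-sb) - (-sab) * sa) / det
    return x, y
```

A weighted variant would simply scale each line's contribution by its confidence level before forming the sums.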
The maximum number of data points grouped in the priority sample matrix can be between 2 and 20, and preferably between 5 and 10. In embodiments according to the second aspect, the estimated text lines may be vertical text blob lines that each correspond to the direction of one of said pixel blobs selected by a blob filtering algorithm on the text portion of the image. In embodiments of the second aspect, there may be provided systems or devices comprising one or more processors and software code portions configured to perform the methods or steps described above. In embodiments of the second aspect, there may be provided non-transitory storage media on which is stored a computer program product comprising software code portions in a format executable on a computing device and configured to perform the methods or steps described above when executed on said computing device. Said computing device may be any of the following: a personal computer, a laptop, a notebook, a mini-laptop, a tablet computer, a smartphone, a digital camera, a video camera, a mobile communication device, a personal digital assistant, a scanner, a multifunction device or any other similar computing device. In a third aspect of the invention, which may be combined with the other aspects described herein, disclosure is made of a method of projectively correcting an image containing at least a text portion that is distorted by the perspective. The method comprises an image binarization step where said image is binarized. Then, the method includes a step of performing a connected component analysis. The connected component analysis involves detecting blobs of pixels in said at least one text portion of said binarized image. A position-determining pixel is selected for each of said pixel blobs on a baseline of the pixel blob. The position-determining pixel defines the position of the pixel blob in the binarized image.
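The bottom-centre choice of position-determining pixel described earlier can be computed directly from a blob's bounding box. This is an illustrative sketch; the (row, column) coordinate convention, with rows growing downward, is an assumption.

```python
def position_determining_pixel(box):
    """Bottom centre of a blob's bounding box, given as
    (min_row, min_col, max_row, max_col): the characteristic point used to
    place the blob on its text baseline. Rows grow downward, so the bottom
    edge of the blob is max_row."""
    min_row, min_col, max_row, max_col = box
    return (max_row, (min_col + max_col) // 2)
```

The alternative embodiments mentioned above (a lower left or lower right corner) would return (max_row, min_col) or (max_row, max_col) instead.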
The method includes a step of determining horizontal vanishing points. Determining horizontal vanishing points includes estimating text baselines using position-determining pixels of said pixel blobs, identifying candidates for horizontal vanishing points from said estimated text baselines, and determining a horizontal vanishing point of said at least one text portion by means of said horizontal vanishing point candidates. The method also includes a step of determining vertical vanishing points for said at least one text portion based on vertical characteristics thereof. The method further comprises a projective correction step. The projective correction comprises correcting said perspective in said image based on said horizontal and vertical vanishing points. The determination of horizontal vanishing points may include a first elimination step at the characteristic points, a second elimination step at the text baselines and a third elimination step at the candidate horizontal vanishing points. In embodiments according to the third aspect, a step of separating text and photos is performed after said image binarization and prior to said connected component analysis, and only the textual information is retained. In embodiments according to the third aspect, the first elimination step comprises a step of detecting confusing characteristic points that are misaligned with respect to the characteristic points in the vicinity of the point under consideration. Said confusing points may be neglected for said text baseline estimation. In embodiments according to the third aspect, the step of removing confusing characteristic points may comprise determining the width and height of the pixel blobs, determining average values for the width and height of the pixel blobs, and detecting said confusing characteristic points as characteristic points belonging to pixel blobs of which at least one of the width and the height differs by a predetermined measure from said calculated average values.
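The width/height-based removal of confusing characteristic points can be sketched as follows; the relative tolerance, the data layout and all names are illustrative assumptions of this sketch.

```python
def filter_confusing_points(blobs, tolerance=0.5):
    """Drop characteristic points whose blob width or height deviates from
    the average by more than `tolerance` (a fraction of the average).
    `blobs` maps each characteristic point to its blob's (width, height);
    the returned mapping keeps only the points used for baseline
    estimation. The threshold value is illustrative."""
    if not blobs:
        return {}
    avg_w = sum(w for w, _ in blobs.values()) / len(blobs)
    avg_h = sum(h for _, h in blobs.values()) / len(blobs)
    return {p: (w, h) for p, (w, h) in blobs.items()
            if abs(w - avg_w) <= tolerance * avg_w
            and abs(h - avg_h) <= tolerance * avg_h}
```

Oversized blobs (merged characters, graphics fragments) and undersized blobs (noise, punctuation) are thereby excluded before baseline estimation.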
In embodiments according to the third aspect, said text baseline estimation step may include a step of clustering characteristic points into groups of characteristic points. Said groups of characteristic points may fulfill at least one of the following conditions: - a point-to-point distance between the characteristic points of the group is less than a first distance threshold, - a point-to-line distance between each characteristic point of the group and a line formed by the characteristic points of the group is less than a second distance threshold, - an angle of the line formed by the characteristic points of the group to the horizontal is less than a maximum angle, and - the group of characteristic points contains a minimum number of characteristic points. Said text baselines can then be estimated based on said groups of characteristic points. In embodiments according to the third aspect, said first distance threshold, said second distance threshold, said maximum angle and said minimum number of characteristic points can be set adaptively based on the content of the image. Said text baseline estimation step may further comprise a step of merging groups of characteristic points, in which the groups of characteristic points on both sides of a neglected characteristic point are merged into a larger group of characteristic points. In embodiments according to the third aspect, the second elimination step includes the steps of: assigning confidence levels to said text baselines and eliminating text baselines based on said confidence levels. Said confidence levels may be based on at least the length of the text baseline concerned and the proximity between the group of characteristic points used to estimate the text baseline and the resulting text baseline. The elimination of text baselines can be performed using a RANSAC algorithm in which said confidence levels are taken into account.
In embodiments according to the third aspect, the third elimination step comprises performing a projective correction based on each identified horizontal vanishing point candidate, comparing the proximity of each horizontal vanishing point candidate to the resulting horizontal text direction after projective correction, and selecting the horizontal vanishing point candidate which is closest to the horizontal text direction of the image document after projective correction. In embodiments according to the third aspect, a first and a second vanishing point candidate can be estimated from said text baselines after said second elimination step. For said estimation of said first and second horizontal vanishing points, different approximation methods may be used, the methods being selected from the group consisting of: a least squares method, a weighted least squares method and an adaptive least squares method. In embodiments of the third aspect, there may be provided systems or devices comprising one or more processors and software code portions configured to perform the methods or steps described above. In embodiments of the third aspect, there may be provided non-transitory storage media on which is stored a computer program product comprising software code portions in a format executable on a computing device and configured to perform the methods or steps described above when executed on said computing device. Said computing device may be any of the following: a personal computer, a laptop, a notebook, a mini-laptop, a tablet computer, a smartphone, a digital camera, a video camera, a mobile communication device, a personal digital assistant, a scanner, a multifunction device or any other similar computing device. Brief Description of the Drawings The invention will be explained in more detail by way of the following description and the accompanying drawings. Fig.
1 shows a flowchart for the projective correction of a distorted image, according to an embodiment of the present disclosure. Fig. 2 shows a flowchart for identifying a horizontal vanishing point, according to an embodiment of the present disclosure. Figures 3A and 3B, which may be referred to jointly as Figure 3 in the text, show a characteristic point clustering algorithm, according to an embodiment of the present disclosure. Fig. 4 shows a flowchart for identifying a vertical vanishing point using position-determining pixels, according to an embodiment of the present disclosure. Fig. 5 shows a flowchart for identifying the vertical vanishing point using text line features, according to an embodiment of the present disclosure. Fig. 6A shows an example of a binarized image containing a picture as well as text, according to an embodiment of the present disclosure. Fig. 6B shows the resulting image after filtering the picture out of the text, according to an embodiment of the present disclosure. Fig. 7 shows an exemplary blob of pixels, according to an embodiment of the present disclosure. Fig. 8 shows a presentation grid allowing a user to adjust the corners of the image, according to an embodiment of the present disclosure. Fig. 9A shows a captured image, according to an embodiment of the present disclosure. Fig. 9B shows an improved image resulting from projective correction, according to an embodiment of the present disclosure. Fig. 10A shows an exemplary image for which characteristic points for the text are identified, according to an embodiment of the present disclosure. Fig. 10B shows an exemplary image with clustered groups of characteristic points, according to an embodiment of the present disclosure. Fig. 10C shows an exemplary image with merged groups of characteristic points, according to an embodiment of the present disclosure. Fig. 11 shows an example of a text portion for which baselines are estimated, according to an embodiment of the present disclosure. Fig.
12 shows an exemplary image with feature points identified in the margin, according to an embodiment of the present disclosure. Fig. 13 shows an exemplary image having two vertical lines estimated along the same margin, according to an embodiment of the present disclosure. Fig. 14 shows an exemplary image illustrating the fusion of estimated vertical lines, according to an embodiment of the present disclosure. Fig. 15 shows an exemplary image illustrating a text line feature of a character, according to an embodiment of the present disclosure. Fig. 16 shows an exemplary image illustrating selectively extracted blobs after identifying text line features, according to an embodiment of the present disclosure. Fig. 17 shows an exemplary image showing estimated vertical text blob lines for selected pixel blobs, according to an embodiment of the present disclosure. Fig. 18 shows an exemplary image showing vertical text blob lines that are selected for the vertical vanishing point, according to an embodiment of the present disclosure. Modes of implementation of the invention The present invention will be described in connection with particular embodiments and with reference to certain drawings, but the invention is not limited thereto, being limited only by the claims. The drawings described are only schematic and are non-limiting. In the drawings, the size of some of the elements may be exaggerated and not drawn to scale, for illustrative purposes. The dimensions and the relative dimensions do not necessarily correspond to actual functional embodiments (reductions to practice) of the invention. In addition, the terms first, second, third and similar in the description and in the claims are used to distinguish between similar elements and not necessarily to describe a sequential or chronological order.
The terms are interchangeable under appropriate circumstances and the embodiments of the invention may operate in other sequences than those described or illustrated herein. In addition, the terms up, down, above, below and similar in the description and claims are used for descriptive purposes and not necessarily to describe relative positions. The terms so used are interchangeable under appropriate circumstances and the embodiments described herein may operate in other orientations than those described or illustrated herein. The term "comprising", as used in the claims, should not be construed as being limited to the means enumerated thereafter; it does not exclude other elements or steps. It must be interpreted as specifying the presence of the elements, integers, steps or components referred to, but does not exclude the presence or addition of one or more other elements, integers, steps or components, or groups thereof. Therefore, the scope of the expression "a device comprising means A and B" should not be limited to devices consisting solely of components A and B. It means, with respect to the present invention, that the only relevant components of the device are A and B. Referring to Figure 1, it shows a flowchart 100 for the projective correction of a distorted image. The image can be received for projective correction. The image can be examined to determine its quality. Image examination may include checking noise, illumination conditions, character clarity, resolution, and the like. If the image quality is above a predetermined threshold, the image can be processed in step 102. If the image quality is below the predetermined threshold, the image can be preprocessed to improve its quality.
Preprocessing may involve changing the color tint, correcting brightness imbalances, adjusting sharpness, eliminating noise, eliminating / correcting motion blur, compensating for camera focus errors, and the like, to restore and improve the resolution of the image. In an exemplary implementation, the preprocessing can be performed automatically. In another exemplary implementation, toolbox options may be provided to a user to select a type of preprocessing for the image. In one embodiment, the preprocessing may be implemented using known techniques that include, but are not limited to, various image filtering methods such as Gaussian filtering, median filtering, Wiener filtering, bilateral filtering, Wiener deconvolution, total-variation-based deconvolution, contrast-limited adaptive histogram equalization and the like. In step 102, image binarization is performed. Image binarization may include converting the pixel values of the received image into either logic one (1) or logic zero (0). These values can be represented by a single bit or by more than one bit, for example as unsigned 8-bit integers. The pixels of the received image may be grayscale pixels or color pixels or pixels represented in any other form. The two values can be represented by the corresponding black or white color. In one embodiment, the binarization can be performed using any of the known techniques, which can be broadly classified into global approaches, region approaches, local approaches, hybrid approaches or variations thereof. In an exemplary implementation, the image binarization is performed using Sauvola binarization. In this technique, binarization is performed on the basis of small local image patches.
When analyzing the statistics of a local image patch, a binarization threshold is determined using the following formula: T = m * (1 + k * (s / R - 1)), where m and s are the local mean and the local standard deviation, respectively, R is the maximum value of the standard deviation, and k is a parameter controlling the threshold value. The parameter k can be chosen according to the document image. In one embodiment, k can be set manually. In another embodiment, the parameter k can be automatically adjusted according to the characteristics of the text of the document image. In step 104, it is determined whether the binarized image (hereinafter, the image) includes photos. If the image does not include photos, the process proceeds to step 108. If the image includes one or more photos, the one or more photos are separated from the text in step 106. Any known techniques, such as page analysis methods, text placement methods, machine learning methods and / or the like, can be used to separate the one or more photos from the text. Techniques based on page analysis methods can be used for images that are created from scanned documents or that appear substantially similar to scanned document images. Techniques based on text placement methods can be used for images with a complex background, such as images having a photo in the background. Techniques based on machine learning methods can be used for any type of image; such techniques may require training samples for learning. In an exemplary implementation for separating the one or more photos from the text, a background of the document image is extracted. The document image is normalized, using the background, to compensate for the effects of uneven illumination. Then, the non-text objects are removed from the binary image using heuristic filtering, in which the heuristic rules are based on the area, the relative size, the proximity to the picture frame, the density, the average contrast, the edge contrast and the like.
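The Sauvola threshold of step 102 can be sketched in a few lines of pure Python. This is a minimal illustration over a single flat patch (the sliding-window handling is omitted), and the default values k = 0.34 and R = 128 (the maximum standard deviation for 8-bit images) are assumptions for the sketch, not values taken from this disclosure:

```python
def sauvola_threshold(patch, k=0.34, R=128.0):
    """Sauvola threshold T = m * (1 + k * (s / R - 1)) over one image patch.

    patch: flat list of gray values; m and s are the patch mean and
    standard deviation, R the assumed maximum standard deviation."""
    n = len(patch)
    m = sum(patch) / n
    s = (sum((p - m) ** 2 for p in patch) / n) ** 0.5
    return m * (1 + k * (s / R - 1))

def binarize_patch(patch, k=0.34, R=128.0):
    """Map each pixel to logic 1 (above threshold) or logic 0."""
    t = sauvola_threshold(patch, k, R)
    return [1 if p > t else 0 for p in patch]
```

For a uniform patch the standard deviation is zero, so the threshold drops to m * (1 - k), which keeps flat background regions from being speckled.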
Figure 6A illustrates an example of a binarized image including a photo as well as text. Figure 6B illustrates the resulting image after eliminating the photo. In step 108, connected component analysis is performed on the binarized image comprising only textual information. The connected component analysis may include the identification and labeling of connected pixel components in the binary image. Pixel blobs can be identified during the connected component analysis. A pixel blob may be a region comprising a set of connected pixels in which certain properties, such as color, are constant or vary within a predetermined range. For example, the word 'Hello' has five different sets of connected components, that is, each character of the word is a connected component or a pixel blob. A position-determining pixel is identified for each of the pixel blobs. A position-determining pixel defines a position of the pixel blob in the binary image. In one embodiment, the position-determining pixel may be a clean point. The clean point can be a pixel at the center of the baseline of the pixel blob. In another embodiment, the position-determining pixel may be a pixel at the left end or the right end of the baseline of the pixel blob. Other embodiments having the position-determining pixel at other locations in the pixel blob, or on an enclosing rectangle drawn around the pixel blob, are contemplated within the scope of this disclosure. FIG. 7 illustrates an exemplary pixel blob 702. A bounding box 704 is formed around the connected component or pixel blob 702. In FIG. 7, the identified connected component is the character 'A' 702. The bounding box 704 has a clean point 706 which can be defined as the center of the bottom of the bounding rectangle 704. The clean point 706 may be one of the position-determining pixels used in this document. Other position-determining pixels can also be used in the projective correction.
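As a sketch of how the position-determining pixels described above can be derived from a labeled pixel blob, the following illustration computes the bounding box and returns the clean point (bottom-center of the box) together with two end points; the coordinate convention (y increasing downward, as in most raster image formats) and the dictionary keys are assumptions for the sketch:

```python
def position_pixels(blob):
    """blob: iterable of (x, y) pixel coordinates of one connected component.

    Returns the clean point (bottom-center of the bounding box, cf. 706)
    and two end position-determining pixels (cf. 708 and 710)."""
    xs = [x for x, _ in blob]
    ys = [y for _, y in blob]
    xmin, xmax = min(xs), max(xs)
    ymin, ymax = min(ys), max(ys)
    return {
        "clean_point": ((xmin + xmax) / 2.0, ymax),  # bottom-center of box
        "bottom_left": (xmin, ymax),                 # lower left end
        "top_left": (xmin, ymin),                    # upper left end
    }
```

Running this over every labeled component yields one clean point per character, which is the input to the baseline estimation below.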
For example, the position-determining pixels 708 and 710 represent the lower-left-end position-determining pixel and the upper-left-end position-determining pixel, respectively. The position-determining pixels can be used to estimate one or more horizontal and / or vertical lines of text in the binarized image. Each line of text represents an approximation of a horizontal or vertical text direction of the associated text portion. In step 110, a horizontal vanishing point is determined. In one embodiment, the horizontal vanishing point can be determined using text baselines determined using the position-determining pixels. Various embodiments for determining the horizontal vanishing point are described with reference to FIG. 2. In step 112, a vertical vanishing point is determined. In one embodiment, the vertical vanishing point is determined using margin lines identified using the position-determining pixels. In another embodiment, the vertical vanishing point may be determined using vertical line features of the connected components. In yet another embodiment, the vertical vanishing point is identified using both margin lines and vertical line features. Various embodiments for determining the vertical vanishing point are described with reference to FIGS. 4 and 5. In step 114, the projective correction of the image is performed using the horizontal vanishing point and the vertical vanishing point. The projective correction is performed on the basis of the estimation of the eight unknown parameters of a projective transformation model. An exemplary projective transformation model, in which (x, y) is a point in the distorted image, (x', y') is the corresponding corrected point and a1 ... a8 are the eight unknown parameters, is: x' = (a1 * x + a2 * y + a3) / (a7 * x + a8 * y + 1), y' = (a4 * x + a5 * y + a6) / (a7 * x + a8 * y + 1). In one embodiment, a horizontal projective transformation matrix and a vertical projective transformation matrix are constructed to estimate the projective transformation model parameters. The horizontal projective transformation matrix and the vertical projective transformation matrix are constructed using an equation provided below.
where (vx, vy) is the vanishing point and (w, h) are the width and height of the document image. The projective correction of the image is performed using the projective matrix. In another embodiment, the vertical vanishing point and the horizontal vanishing point can be used to identify the corners of the original distorted image and their corresponding locations in the undistorted, registered document image. A projective transformation model can then be estimated on the basis of the four pairs of corresponding corners. The eight parameters of the projective transformation model can be obtained after identifying the four corners in the projection-corrected image. After constructing the projective transformation model, a general preview of the projective correction can be generated and displayed for review by the user, as shown in Figure 8. The user can be provided with an option to accept the preview, or with tools to adjust the four corners. For example, as illustrated in Figure 8, a graphical user interface element 804 may be provided with the ability for the user to adjust the corners. In response to a change in the corners due to user intervention, the projective transformation model can be modified and the corresponding projective correction can be performed. In response to acceptance without change, the projective correction can be performed. The resulting image may be presented as illustrated by item 806 in FIG. 8. One skilled in the art will appreciate that additional appropriate options may also be provided to the user. An exemplary projective correction result is illustrated in Figs. 9A and 9B. Figure 9A illustrates a captured image. Figure 9B illustrates the image after projective correction. Fig. 2 illustrates an exemplary method 200 for identifying the horizontal vanishing point, according to one embodiment. In step 202, clean points can be identified.
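The estimation of the projective transformation model from the four corner pairs in step 114 can be illustrated with a standard direct linear solution. The sketch below assumes the common 8-parameter formulation with the lower-right matrix entry fixed to 1 (the disclosure's own matrices are not reproduced here), and uses plain Gaussian elimination so it stays self-contained:

```python
def gauss_solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with
    partial pivoting (A is a list of rows)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def solve_homography(src, dst):
    """Estimate the eight homography parameters mapping src[i] -> dst[i]
    from four corner pairs; returns a 3x3 matrix with H[2][2] = 1."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h1 x + h2 y + h3) / (h7 x + h8 y + 1), and similarly for v,
        # rearranged into two linear equations in h1..h8 per corner pair.
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = gauss_solve(A, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]

def apply_homography(H, pt):
    """Apply H to a point in homogeneous coordinates and dehomogenize."""
    x, y = pt
    d = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / d,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / d)
```

Once the four user-confirmed corners are available, `solve_homography` yields the model and `apply_homography` warps any pixel coordinate into the corrected image.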
Clean points can be identified through the connected component analysis of the image. Clean points are defined for all pixel blobs. In step 204, the clean points are classified and clustered into groups. In one embodiment, the clean points can be processed before being clustered. The processing of clean points may include the elimination of confusing clean points. Confusing clean points are clean points that lie either above or below a text baseline. The confusing clean points come mainly from three sets of characters: the first set includes characters that are composed of two blobs, the smaller blob being above the text baseline, such as "i", "j" and the like; the second set includes characters that extend below the text baseline when printed, such as "p", "q" and "g"; and the third set includes characters such as the comma (,), the dash (-) and the like. The confusing clean points associated with the first and third sets of characters can be identified based on the size of the pixel blobs. The size of the pixel blobs associated with the first and third sets of characters can be significantly smaller, either horizontally or vertically, compared to other characters, so confusing clean points can be identified by comparing the pixel blob sizes with average values over all pixel blobs. In an exemplary implementation, the width and height of all pixel blobs are calculated. In addition, average values for the width (mw) and height (mh) of all pixel blobs are calculated. Clean points belonging to pixel blobs whose width and / or height deviate from said calculated average values to a predetermined extent are marked as confusing clean points. In an exemplary instance, clean points having a width beyond the range of [0.3, 5] * mw and / or a height outside a corresponding range based on mh are identified as confusing clean points. Such confusing clean points can be eliminated without further processing.
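A minimal sketch of this size-based elimination follows. Since the height range is not fully specified above, the sketch assumes, by symmetry with the width, the same relative bounds [0.3, 5] * mh for the height; the function name and the (width, height) input format are likewise assumptions for illustration:

```python
def filter_confusing_blobs(blobs, low=0.3, high=5.0):
    """blobs: list of (width, height) tuples, one per pixel blob.

    Returns the indices of blobs whose clean points are kept; blobs far
    smaller or larger than the average size are marked as confusing."""
    mw = sum(w for w, _ in blobs) / len(blobs)  # average width
    mh = sum(h for _, h in blobs) / len(blobs)  # average height
    return [i for i, (w, h) in enumerate(blobs)
            if low * mw <= w <= high * mw and low * mh <= h <= high * mh]
```

A dot or comma a tenth of the average size falls below the 0.3 * mw bound and is eliminated before clustering.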
The remaining clean points are classified and clustered into different groups of clean points so that each group includes clean points of the same line of text. An example of a clean-point clustering algorithm is described with reference to FIG. 3. The clean-point clustering algorithm is based on the assumption that clean points of the same group typically fulfill one or more of the following conditions: (1) the clean points are close to one another; (2) the clean points form a practically straight line; and (3) the direction of the constructed line is close to the horizontal direction. In one embodiment, these conditions are translated into respective constraints in the clean-point clustering algorithm, so that a clean point is assigned to a specific clean-point group if at least one of the following conditions is satisfied: a point-to-point distance between that clean point and the other clean points of the group is less than a first distance threshold Td; a point-to-line distance between that clean point and a line formed by the group's clean points is less than a second distance threshold Tl; and an angle of the line formed by the clean points of the group relative to the horizontal is less than a maximum angle Ta. In addition, to make the clean-point clustering algorithm more robust, an additional constraint can be added so that a group of clean points must include at least a minimum number of clean points Tm. In one embodiment, the constraints of the clean-point clustering algorithm, that is, the point-to-point distance threshold Td, the point-to-line distance threshold Tl, the maximum angle with respect to the horizontal Ta and the minimum number of clean points Tm in a group of clean points, can be adjusted adaptively on the basis of an analysis of the image, for example the analysis of camera document images. In an alternative embodiment, the parameters can be set manually.
The maximum angle Ta with respect to the horizontal direction can be set to about 20 degrees; Tm can be around 10, assuming there are at least 2 or 3 words in the text. It should be understood that other values may be selected for Ta and Tm. The values of Td and Tl may depend on the content of the text in the document image. For example, if the character size is large, then Td and Tl can be kept larger, and vice versa. In one embodiment, Td and Tl can be adaptively calculated as follows. A median distance Dc is calculated based on all the shortest distances between neighboring characters in a word. Tl can be set to Dc and Td can be set to 3 * Dc. These values are chosen so that Td is large enough to search for neighboring letters and words in the same paragraph, while avoiding that words belonging to horizontally adjacent paragraphs are considered to be in the same group of clean points. Setting Td large enough to search for neighboring letters and words in the same paragraph would still allow identification of the margin line between a paragraph and the horizontally adjacent paragraph. In some instances, the space between words in a single line may cause an overclassification of the clean points of one line into more than one group of clean points. Overclassification may be due to some small or large connected components that were eliminated during the clean-point elimination procedure, causing a large gap between words. In step 206, the overclassified clean-point groups are consolidated by merging them into corresponding groups. An exemplary group-merging algorithm can be described as follows. For each group of clean points Ci (n >= i >= 1), the left and right end points li and ri (n >= i >= 1) can respectively be identified. The pixel blob corresponding to the rightmost clean point of the clean-point group is identified.
Right pixel blobs adjacent to the rightmost clean point are searched for among the eliminated pixel blobs (for example, pixel blobs whose clean points were marked as confusing). In response to the identification of a neighboring right blob, that blob can be established as the new right end point ri. The step of searching for another right pixel blob next to the new right end point, as described in the previous step, may be repeated until no further neighboring right blob is found. In response to the absence of a neighboring right blob, the clean point coordinate of the blob is recorded as the new right end point. With the new matrix of right end points, a search index k is initialized to zero (0). The search index can then be increased by 1: k = k + 1, and the distance between the new right end point of group k and the left end point of group k + 1 can be calculated. The groups of clean points corresponding to this pair of end points may be merged if they satisfy at least one of the following conditions: the distance between the groups of clean points is within a predetermined distance (in an exemplary implementation, the distance may be less than 0.5 * Td); and the lines corresponding to the groups of clean points are close to each other (for example, the distance between the lines is less than Tl). If the groups of clean points are merged, the number of groups of clean points is reduced by one: n = n - 1. A check can be made to determine whether the search index is equal to the number of groups (k == n). If the search index is not equal, the search index is increased, and the previous steps of calculating the distance and merging the clean-point groups are repeated if the above-defined conditions are fulfilled. FIG. 10A illustrates an exemplary image before the classification of clean points. Figure 10A illustrates clean points for pixel blobs at the text baseline. FIG. 10B illustrates an example of an image after classification of clean points into groups. The figure shows an image having groups in each of the lines of text.
For example, the first line of text illustrates a group of clean points 1002. The second line of text shown in the image shows overclassified groups of clean points 1004 and 1006. The overclassified groups 1004 and 1006 (two groups) can be seen in the second line of text of Figure 10B (indicated by square and round symbols for the corresponding groups of clean points). Figure 10C illustrates an exemplary image having consolidated clean-point groups. The overclassified groups 1004 and 1006 of the second line as illustrated in Figure 10B are consolidated into a single group of clean points 1008 (indicated by plus marks). In step 208, text baselines are estimated using the resulting clean points after the clustering and merging steps. In one embodiment, the text baselines are estimated using a method (hereinafter referred to as line estimation) based on adaptive weighted line estimation. The line estimation can assign a priori weighting factors to each of the clean points involved in the line estimate. Consider a scenario where n clean points p1, p2, ... pn are used for the estimation of the line ax + by + c = 0 (or, equivalently, y = kx + t). A weighting factor w1, w2, ... wn can be assigned to each of the clean points. In this case, the line estimation can be considered equivalent to a minimization problem that is defined by: E(k, t) = sum over i of wi * (yi - k * xi - t)^2. [5] The minimum sum of squares in equation [5] can be found by setting the gradient to zero. Since the model contains two (2) parameters, there are two (2) gradient equations. The minimization of the above equation can be performed using the following exemplary pseudocode: function line = weighted_least_square_for_line(x, y, weighting); part1 = sum(weighting .* x .* y) * sum(weighting(:)); part2 = sum(weighting .* x) * sum(weighting .* y); part3 = sum(x.^2 .* weighting) * sum(weighting(:)); part4 = sum(weighting .* x).^2; beta = (part1 - part2) / (part3 - part4); alpha = (sum(weighting .* y) - beta * sum(weighting .* x)) / sum(weighting); a = beta; c = alpha; b = -1; line = [a b c]; A weighting factor can be assigned to each clean point using a weighting function of dist_i, where dist_i is defined as the distance between the clean point and an expected baseline. A higher weighting factor is therefore assigned to a clean point lying close to the expected text baseline, and vice versa. An iterative procedure can be used to converge on the expected text baseline. In an exemplary implementation, the iterations can be performed for a predetermined number of cycles (for example, about 10-70 cycles) or until the difference between two successive line angles is less than a small threshold (for example, about 0.01 degree). The estimated lines can be further refined by eliminating outliers in the clean-point group. Outliers can be identified, for example, using a Gaussian model. According to the Gaussian model, most clean points (e.g., about 99.7%) lie within three standard deviations. Therefore, if a clean point is located beyond three standard deviations, it can be considered an outlier. The remaining clean points in the group can then be used for line estimation with the conventional least squares method. The above-mentioned line estimation can then be performed for all groups of clean points. Figure 11 illustrates an example of a portion of text for which baselines are estimated. It can be seen that the groups of clean points are shown as being connected by a line. An example of a line is highlighted at 1102. In step 210, the horizontal vanishing point can be identified using the estimated text baselines. According to homogeneous coordinate theory, each horizontal line in the Cartesian coordinate system can be considered as a data point in the homogeneous space, and a line that passes through these data points corresponds to a vanishing point.
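Following this homogeneous-coordinate view, each baseline a * x + b * y + c = 0 becomes a datum, and the horizontal vanishing point (vx, vy) is the point that best satisfies a * vx + b * vy + c = 0 for all baselines. A least-squares sketch of this idea, assuming each line is normalized so that a^2 + b^2 = 1 (so the residual is the point-to-line distance):

```python
def vanishing_point(lines):
    """lines: list of (a, b, c) with a*x + b*y + c = 0 and a*a + b*b == 1.

    Returns the point minimizing the sum of squared distances to all
    lines, via the 2x2 normal equations of the least-squares problem."""
    Saa = sum(a * a for a, b, c in lines)
    Sab = sum(a * b for a, b, c in lines)
    Sbb = sum(b * b for a, b, c in lines)
    Sac = sum(a * c for a, b, c in lines)
    Sbc = sum(b * c for a, b, c in lines)
    # Solve [Saa Sab; Sab Sbb] [vx; vy] = [-Sac; -Sbc] by Cramer's rule.
    det = Saa * Sbb - Sab * Sab
    vx = (-Sac * Sbb + Sbc * Sab) / det
    vy = (-Sbc * Saa + Sac * Sab) / det
    return vx, vy
```

With exactly concurrent lines this recovers their intersection; with noisy baselines it returns the least-squares compromise, which is the role the RANSAC-based selection below refines by discarding outlier baselines first.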
Thus, the identification of the horizontal vanishing point can be considered as a line-fitting problem in the homogeneous coordinate system. Although the text baselines are carefully estimated, some of them may act as outliers with respect to the vanishing point estimation. Such outlier data points can be eliminated to improve the estimate of the horizontal vanishing point. Outliers can result from imprecise line estimation, non-text components (for example, in cases where the separation of text and photos fails), distortions and the like. To overcome this problem, according to one embodiment, a method based on the RANSAC (Random Sample Consensus) algorithm, as described in Martin A. Fischler and Robert C. Bolles, "Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography", Comm. of the ACM 24 (6): 381-395, June 1981, is used for the identification of the horizontal vanishing point. The RANSAC-based algorithm is selected because of its robustness in eliminating outliers when estimating model parameters. The proposed RANSAC-based algorithm differs from the conventional RANSAC algorithm in how the initial data points are selected for the estimation of the model parameters, and confidence levels can be taken into account in conjunction with them. Unlike the random selection of initial data points in the conventional RANSAC algorithm, the proposed RANSAC-based algorithm selects the initial samples that have the highest confidence. An exemplary implementation of the proposed RANSAC-based algorithm will now be described. In one embodiment, each of the estimated text baselines may be defined in a Cartesian coordinate system. Each of the text baselines defined in the Cartesian coordinate system can be transformed into a data point in a homogeneous coordinate system. Confidence levels can be assigned to each of the data points.
The confidence level for a data point can be determined based on the proximity of the clean points used to estimate the text baseline to the resulting text baseline, and on the length of the text baseline concerned. The confidence level for each horizontal text baseline can be defined by a formula in which s_max and s_min represent the maximum and minimum standard deviations over all n line segments, and l_max represents the longest line segment of all n lines. A higher confidence level is therefore assigned to a longer horizontal text baseline. This is based on the assumption that the longer the text baseline, the better the estimate of the horizontal text baseline. Similarly, the lower the standard deviation (an indicator of the proximity of the clean points to the corresponding estimated text baseline), the better the estimate of the text baseline. As a result, high confidence levels are attributed to such text baselines. The data points having confidence levels above a predetermined threshold may be grouped in a priority sample array. The data points in the priority sample array can be clustered into a number of sample groups. In one embodiment, each sample group may comprise two or more data points. For line estimation, accuracy is also determined by the distance between the data points that are used to estimate the line. If two data points are far apart, there is greater confidence that the line estimate is correct. Therefore, a second confidence level indicator can be assigned to the pair of points in a sample group, based on a formula in which Dis_j,k is the distance between line j and line k in the vertical direction and Dis_max is the maximum distance among the m * (m - 1) line pairs. A selection of m (m <= n) lines can be considered to formulate the priority sample groups by selecting the first m lines that have the best confidence levels.
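The exact confidence formula is not recoverable from the text above, so purely as an illustration of the stated idea (lower spread around the baseline and greater length both increase confidence), the sketch below combines a normalized spread score with a normalized length score; the equal 0.5 / 0.5 weighting is an assumption, not the patent's formula:

```python
def line_confidence(std_devs, lengths):
    """std_devs[i]: standard deviation of the clean points around baseline i.
    lengths[i]: length of baseline i.

    Returns one confidence value per baseline in [0, 1], assuming an
    equal-weight blend of a low-spread score and a length score."""
    s_max, s_min = max(std_devs), min(std_devs)
    l_max = max(lengths)
    conf = []
    for s, l in zip(std_devs, lengths):
        # lowest spread -> score 1; longest line -> score 1
        spread = 1.0 if s_max == s_min else (s_max - s) / (s_max - s_min)
        conf.append(0.5 * spread + 0.5 * l / l_max)
    return conf
```

Sorting baselines by this value and keeping the top m is what populates the priority sample array described above.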
A group confidence value can be assigned to each sample group based on at least the confidence level assigned to each data point in the sample group. In step A, the sample groups of data points can be selected iteratively from the priority sample matrix for line fitting. The iteration can begin with the sample group having the highest confidence value in the priority sample matrix. (If the iteration count exceeds a certain threshold, the iteration can be stopped and the algorithm proceeds to step F.) In step B, the line fitting can be performed for the first sample group, resulting in a first fitted line, and then the line fitting can be performed for each other sample group, resulting in further fitted lines. In step C, a set of data points that are positioned below a predetermined distance threshold from the first fitted line can be determined on the basis of the first and the other fitted lines. These data points are called "inliers" (valid data). The predetermined distance threshold from the first fitted line may be a fixed parameter or may be adaptively adjusted based on the content of the document image. In step D, the count of data points that are positioned below the predetermined distance threshold from the first fitted line is calculated. The maximum number of inliers determined so far is recorded. In step E, a check can be made to determine whether the maximum number of inliers is equal to the number of data points. If the maximum number of inliers is not equal to the number of data points, the iteration count is updated and step A can be restarted. If the maximum number of inliers is equal to the number of data points, step F can be started. In step F, the maximum set of inliers can be used to estimate the vanishing points.
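Steps A through E can be sketched as follows. The sample groups are assumed to be pairs of data-point indices already sorted by descending group confidence, and the distance threshold tau is an illustrative parameter; the helper names are assumptions for the sketch:

```python
def fit_line(p, q):
    """Line through two points, returned as normalized (a, b, c) with
    a*x + b*y + c = 0 and a*a + b*b == 1."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    c = -(a * x1 + b * y1)
    n = (a * a + b * b) ** 0.5
    return a / n, b / n, c / n

def prioritized_ransac(points, groups, tau=0.5):
    """points: 2D data points (one per baseline, in homogeneous space).
    groups: index pairs, pre-sorted by descending group confidence.

    Iterates highest-confidence sample groups first (step A), fits a
    line to each pair (step B), collects inliers within tau (steps C-D)
    and stops early when every point is an inlier (step E)."""
    best = []
    for i, j in groups:
        a, b, c = fit_line(points[i], points[j])
        inliers = [p for p in points if abs(a * p[0] + b * p[1] + c) < tau]
        if len(inliers) > len(best):
            best = inliers
        if len(best) == len(points):  # all points agree: go to step F
            break
    return best
```

The returned inlier set is then handed to step F, where the vanishing point candidates are estimated from it.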
In one embodiment, first and second horizontal vanishing point candidates may be estimated using different approximation methods selected from the group consisting of a least squares method, a weighted least squares method and an adaptive least squares method. The use of alternative approximation methods is also considered in this document. In step G, the horizontal vanishing point candidate that is closest to the horizontal text direction in the document image after projective correction can be selected. The proximity to the horizontal text direction can be measured by a criterion in which n is the number of horizontal lines in the document image, a_i is defined as the angle of the i-th line with respect to the horizontal direction (180° >= a_i >= 0°) after the projective correction has been performed, and p is the index of the p-th candidate horizontal vanishing point selected from the m candidate vanishing points. The conventional RANSAC algorithm randomly selects the points used for the initial line estimate. As a result, there may be different results each time the conventional RANSAC algorithm is executed. In addition, it may be difficult to judge the results of the conventional RANSAC algorithm. The proposed RANSAC-based algorithm overcomes this problem by incorporating prior knowledge about the points. In the proposed RANSAC-based algorithm, the points that have good confidence levels are selected first to estimate the inliers. As a result, the proposed RANSAC-based algorithm provides more consistent results. Although the disclosure describes the use of clean points for the determination of horizontal vanishing points, it should be understood that other position-determining pixels of the pixel blobs can also be used for the determination of horizontal vanishing points. Figure 3 depicts an example of a clean-point clustering algorithm 300, according to one embodiment. In step 302, a set of clean points I can be identified.
In step 304, the clean points can be counted to determine whether the number is sufficient to create a group of clean points. If the number is sufficient (at least above a threshold number T_m), the set of clean points I can be processed. The threshold number can be set as a constraint for group creation. If the number of clean points is less than the threshold, step 324 can be performed. In an exemplary implementation, the threshold number of clean points can be 10, suggesting the presence of at least 2 or 3 words in a single line; the threshold can be set to prevent the possibility of assigning unrelated clean points to the clean point group. In step 306, a clean point (e.g., p0) is selected from the set of clean points I. The clean point p0 may be entered as a first clean point in a candidate line group C. In one embodiment, the candidate line group C may be a bidirectional queue. Then the clean point p0 is removed from the set of clean points I, and the clean points on one side of p0 are entered in the candidate line group C. In step 308, the most recently joined clean point p_i of the candidate line group C is selected on one side of the bidirectional queue (for example, the queue in the non-negative direction, i >= 0). A clean point p* of the set of clean points I which is closest to the clean point p_i is identified. 
In step 310, the distance between the clean points p_i and p* is calculated. If the distance is less than the threshold distance, step 312 is performed. If the distance is greater than the threshold distance, step 314 is performed. The threshold distance refers to the maximum distance between clean points that will be in the same group. In an exemplary implementation, the threshold distance between the group's clean points is a first distance threshold, which may be equal to 3 times the median distance between nearest neighboring clean points. In step 312, it is determined whether the selected clean point p* satisfies the constraints imposed by the point-to-line distance threshold (T_l) and the proximity to the horizontal direction threshold (T_a). The point-to-line distance threshold (T_l) can set the maximum distance of a point from the text baseline for a clean point to be selected. The point-to-line distance threshold (T_l) is used to select the clean points that help formulate a straight line. The proximity to the horizontal direction threshold (T_a) can define the maximum angle of the line of clean points with respect to the horizontal direction for a clean point to be selected for the group of clean points. The proximity to the horizontal direction threshold (T_a) is used to select the clean points that keep the direction of the line close to the horizontal direction. In an exemplary implementation, T_a may be twenty (20) degrees. In response to the determination that the clean point p* satisfies the constraints, the clean point p* can be selected for the candidate line group C as the point p_{i+1} in the bidirectional queue (in the non-negative direction), with i = i + 1. In response to the determination that the selected clean point p* does not satisfy the constraints, it may be placed in a special line group L. 
Process steps 308 to 312 are performed until all clean points on said one side (the non-negative direction in the bidirectional queue) are evaluated. In response to completing the evaluation of said one side of the clean points, the remaining clean points on the other side of p0 are taken into consideration (the non-positive direction of the bidirectional queue). The remaining clean points on the other side of p0 are entered in the candidate line group C. In step 314, a clean point p_j (in the non-positive direction of the bidirectional queue, j <= 0) of the candidate line group C is selected on the other side. A clean point p* of the set of clean points I which is closest to the clean point p_j coming from the other side of the candidate line group C is identified. In step 316, the distance between the clean points p_j and p* is calculated. If the distance is less than the threshold distance, step 318 is performed. If the distance is greater than the threshold distance, step 320 is performed. In step 318, it is checked whether the selected clean point p* satisfies the constraints with respect to T_l and T_a. In response to the determination that the clean point p* satisfies the constraints, the clean point p* can be selected for the candidate line group C as the point p_{j-1} in the bidirectional queue (in the non-positive direction), with j = j - 1. In response to the determination that the clean point p* does not satisfy the constraints, it can be placed in the special line group L. Process steps 316 to 318 are performed until all clean points on the other side are evaluated. In step 320, the clean points in the candidate line group C can be counted to determine whether the number is greater than a threshold number T_m. If the number is greater than T_m, step 322 is performed. If the number is less than T_m, the process returns to step 304 to determine whether there are other clean points to process. 
In step 322, an index number is assigned to the candidate line group C so that the candidate line group C becomes a dot matrix for a line indexed by the index number. In step 324, for each clean point in the special line group L, it is checked whether the clean point is within the constraints T_m, T_l and T_a for any of the groups of lines. In response to the determination that the clean point is within the constraints T_m, T_l and T_a, the clean point is merged into the corresponding group of lines. The process is repeated for each text baseline until all lines in the document image are processed. An advantage of the clean point clustering algorithm as described in this document is that it provides consistent clustering results regardless of the initial points chosen for clustering. Using the bidirectional queue allows the use of two end points on one line instead of one end point in one direction, thus reducing the dependence of the algorithm on the seeding point used to formulate the group of points. The clean point clustering algorithm is flexible in that it does not require each clean point to belong to one of the groups of points. Some clean points that do not fit into any of the groups are eliminated or ignored. This results in easier and faster convergence of the proposed clean point clustering algorithm compared with conventional clustering algorithms. However, the use of conventional algorithms or any other clustering algorithm to cluster clean points into different groups of lines is also contemplated in this document. Figure 4 depicts an exemplary flowchart 400 for identifying the vertical vanishing point using margin feature points, according to one embodiment. In step 402, the margin feature points can be identified. Margin feature points may be position-determining pixels, according to one embodiment. Margin feature points can be identified as described below. 
In one embodiment, the margin feature points may be the lower left end pixel of the pixel blobs for the left margin, and the lower right end pixel of the pixel blobs for the right margin. The lower left end points can be identified by finding the blob associated with the leftmost clean point in the groups of clean points (e.g., identified during the estimation of horizontal lines). The groups of clean points determined after the point merging step and before the use of point groups for horizontal line formulation can be used for the determination of margin points. The reason is that, after the merging of clean points, the leftmost or rightmost clean point may correspond to the blob forming the margin. Moreover, no clean point has yet been eliminated just before the lines are formulated. The leftmost clean point can be found by comparing the x coordinates of the clean points in the group. The blob corresponding to the leftmost clean point can then be found. The lower left end point of that blob can be used as a left margin feature point. Similarly to the lower left end points, the lower right end points can be identified by finding the blob associated with the rightmost clean point in the groups of clean points. After having identified the blob at the right end of the group of clean points, it is possible to determine whether there are neighboring blobs near the identified rightmost blob. A blob search is then performed using a process similar to the neighbor blob search algorithm described earlier. The lower right end points corresponding to the found blobs are then used to formulate the feature points for the right margin line estimate. In alternative embodiments, other variations of the margin feature points may be used. Figure 12 illustrates an exemplary image with margin feature points identified at the margin. It can be seen that the margin feature points are marked as shown at 1202. 
Paragraph margins are generally vertical and parallel if no projective distortion occurs. In step 404, margin feature points are clustered into different margin groups. Margin feature points along the margin lines of the document in the image can be used to estimate margins. In one embodiment, the margin feature points may be clustered based on the proximity of the pixel blobs in the corresponding margins. In an exemplary embodiment, a clustering algorithm similar to the clean point clustering algorithm described in connection with Figure 3 can be used to cluster the margin feature points. In an alternative embodiment, a different end point clustering algorithm may be used, as described below. Step 1: Set the margin feature point distance threshold T_endm; all identified left margin points (from step 402) are denoted {P_i}. Step 2: Initialize the group of left margin points {C_1} with a randomly selected point in {P_i}, eliminate this point from {P_i}, and set group_index = 1. Step 3: For each point in {P_i}, calculate the minimum distance between this point and the points in each of the groups {C_1, ..., C_group_index}. If the distance is less than T_endm, then this point is assigned to the group of points that reaches the minimum distance; otherwise the group index is increased by 1, group_index = group_index + 1, and this point is assigned to the most recent left margin point group {C_group_index}. T_endm is set to 6 * T_d (T_d being the median distance between clean points, as previously discussed in relation to Fig. 2); this value can be selected to be large enough that neighboring margin feature points belonging to the same margin line are found in the same group. The left end point clustering process may differ from the clean point clustering process used for estimating horizontal lines, since the final left end point clustering can use all margin points, whereas in the clean point clustering algorithm some clean points can be eliminated during the clustering process. 
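The three steps of the end point clustering above can be rendered in a short Python sketch (hypothetical names; as a simplifying assumption, this sketch scans points in input order rather than starting from a randomly selected seed):

```python
import math

def cluster_margin_points(points, t_endm):
    """Greedy nearest-group assignment of margin end point features
    (steps 1-3 above): a point joins the closest existing group when its
    minimum distance to that group is below t_endm; otherwise it seeds a
    new group."""
    groups = []
    for p in points:
        best_g, best_d = None, float("inf")
        for g in groups:
            d = min(math.hypot(p[0] - q[0], p[1] - q[1]) for q in g)
            if d < best_d:
                best_g, best_d = g, d
        if best_g is not None and best_d < t_endm:
            best_g.append(p)        # assign to the nearest existing group
        else:
            groups.append([p])      # open a new margin point group
    return groups
```

With t_endm = 6 * T_d as in the text, end points running down the same margin line collect into one group, while the margin of a second column starts a new group.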
In alternative embodiments, other clustering algorithms may also be used. The clustered position-determining pixels identified at the margins can be processed into different groups of margin points. For example, if there are two columns in a document image, the position-determining pixels for the left margins and the right margins of the two columns are identified and grouped accordingly. In step 406, overlapping margin lines can be consolidated with corresponding margin lines. For example, two or more lines along the same margin can be consolidated into a single margin. In step 408, the estimation of vertical lines can be performed using the margin point groups. In a manner similar to the clean point clustering algorithm, not all margin point groups can be used for estimating vertical lines. The margin feature points of a group may have to meet one or more of the following conditions to qualify for margin line estimation: a minimum number of points on the margin line P_th (for example, the threshold for P_th can be 3 points); a minimum percentage of points P_t on the margin line (for example, about 50%); a maximum angle a_v of the line with respect to the vertical direction (for example, the maximum angle may be about 20°); and a minimum confidence level P_b for non-edge points (for example, the minimum for non-edge points may be about 50%). A margin feature point (which contributes to P_th) can be considered as being on the margin line if the distance between the position-determining pixel and the margin line is within a threshold (T_l), which is equal to the median distance of the clean points (T_d) in an exemplary implementation. The percentage of points on the margin line P_t can be defined as the ratio between the number of points on the margin line and the number of margin feature points in the clustered group. In some embodiments, there may be position-determining pixels that are out of range. 
For example, when the content of the document is only partially captured, the edge of the image may contain content that is half captured. Position-determining pixels associated with such edge blobs may be defined as edge points. Edge points cannot be used in the margin line estimate, and the percentage of non-edge points can be defined as the ratio between the number of non-edge points and the number of margin feature points in the clustered margin feature group. The minimum confidence level of non-edge points P_b is defined as the product of the percentage of points on the margin line and the percentage of non-edge points. In one embodiment, the estimation of vertical lines may be performed using the perpendicular deflection least squares method, although alternative methods are also contemplated in this document. Suppose an almost vertical potential line is expressed as y = kx + t. With the perpendicular deflection least squares method, the optimal line coefficients minimize the objective function sum_i w_i * (k*x_i + t - y_i)^2 / (k^2 + 1), that is, the weighted sum of squared perpendicular distances from the sample points to the line. Based on the perpendicular deflection least squares method, a robust iterative method for estimating near-vertical lines, as described below, may be employed in one embodiment. In step 1, a line is initialized using the perpendicular deflection line estimation method. In step 2, the distance of the sample points from the line is calculated. In step 3, the line function can be recalculated based on the weighted perpendicular deflection method. In step 4, the angle difference between successive estimated lines can be calculated. If the angle difference is less than a predefined threshold or if the iteration count exceeds the maximum allowable iterations, the method goes to step 5. If the angle difference is greater than the predefined threshold and the iteration count is within the limits of the maximum allowable iterations, the next iteration is performed by going to step 2. 
In step 5, the line function is calculated. The predefined threshold and the maximum allowable iteration count are the same values as the respective parameters in the horizontal line estimation method, according to one embodiment. As a variant, values different from those used for estimating horizontal lines are used for the predefined threshold and the maximum allowable iterations for estimating vertical lines. The weighted perpendicular deflection method can be implemented using the following example of pseudocode:

function line = estimate_line_ver_weighted(pt_x, pt_y, w)
% pt_x: x coordinates
% pt_y: y coordinates
% w: weighting factors
pt_x = pt_x(:); pt_y = pt_y(:); w = w(:);
% step 1: calculate n
n = sum(w(:));
% step 2: calculate weighted coordinates
x_square = pt_x .* pt_x;
y_square = pt_y .* pt_y;
x_square_weighted = x_square .* w;
y_square_weighted = y_square .* w;
x_weighted = pt_x .* w;
y_weighted = pt_y .* w;
% step 3: calculate the formula
B_upleft  = sum(y_square_weighted) - sum(y_weighted)^2 / n;
B_upright = sum(x_square_weighted) - sum(x_weighted)^2 / n;
B_down    = sum(x_weighted) * sum(y_weighted) / n - sum(x_weighted .* pt_y);
B = 0.5 * (B_upleft - B_upright) / B_down;
% step 4: calculate b
if B < 0
    b = -B + sqrt(B^2 + 1);
else
    b = -B - sqrt(B^2 + 1);
end
% step 5: calculate a
a = (sum(y_weighted) - b * sum(x_weighted)) / n;
% step 6: the model is y = a + b*x; transform the model to
% a_*x + b_*y + c_ = 0
c_ = a; a_ = b; b_ = -1;

In another embodiment, the estimation of vertical lines can be performed using a changeable x-y weighted least squares method. In the changeable x-y least squares method, the x and y coordinates are exchanged before estimating the vertical line, so that the vertical offset is constrained during the estimation of vertical lines. Once the vertical lines are estimated, the vertical lines can be merged. 
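For readers who prefer an executable rendering, the same weighted perpendicular-deflection fit can be written in Python. This is a sketch equivalent to the pseudocode above (the function name is kept for clarity), not the patented implementation itself.

```python
import math

def estimate_line_ver_weighted(pt_x, pt_y, w):
    """Weighted perpendicular-deflection (orthogonal) least squares.
    Returns (a, b) for the model y = a + b*x."""
    n = sum(w)
    sxw  = sum(wi * xi for wi, xi in zip(w, pt_x))
    syw  = sum(wi * yi for wi, yi in zip(w, pt_y))
    sxxw = sum(wi * xi * xi for wi, xi in zip(w, pt_x))
    syyw = sum(wi * yi * yi for wi, yi in zip(w, pt_y))
    sxyw = sum(wi * xi * yi for wi, xi, yi in zip(w, pt_x, pt_y))
    B_upleft  = syyw - syw * syw / n      # weighted variance of y
    B_upright = sxxw - sxw * sxw / n      # weighted variance of x
    B_down    = sxw * syw / n - sxyw      # minus the weighted covariance
    B = 0.5 * (B_upleft - B_upright) / B_down
    # pick the larger-magnitude root of b^2 + 2*B*b - 1 = 0,
    # which suits near-vertical lines (steep slopes)
    b = -B + math.sqrt(B * B + 1) if B < 0 else -B - math.sqrt(B * B + 1)
    a = (syw - b * sxw) / n
    return a, b
```

For points lying exactly on a steep line such as y = 10x, the fit recovers a = 0 and b = 10, whereas an ordinary (vertical-deflection) least squares fit would become ill-conditioned as the line approaches the vertical.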
For example, multiple margin lines interrupted along a line space can be merged to form a single margin. Vertical lines can be merged using the following steps. In step 1, for each margin line, the x coordinate can be calculated while keeping the vertical coordinate (y coordinate) fixed. In step 2, the x coordinate distance between the margin lines can be calculated. If the distance between the x coordinates is less than a threshold T_vth, the margin lines can be merged. T_vth can be chosen to be 2 * T_d, T_d being the median distance between margin feature points. In cases where there are multiple vertical lines, the nearest vertical lines can be merged before being used to identify vertical vanishing points. Fig. 13 illustrates an exemplary image showing two estimated vertical lines 1302A and 1302B along the same margin. Fig. 14 illustrates an exemplary image showing the merging of the estimated vertical lines of Fig. 13 into a single margin 1402. In step 410, the vertical vanishing point can be identified using the estimated vertical lines. The estimated vertical lines can be processed using a modified RANSAC algorithm as described below, which is very similar to the method used for the identification of horizontal vanishing points. The estimated vertical margin lines resulting from the merging step can be defined in a Cartesian coordinate system. In addition, each of said estimated vertical margin lines is transformed from the Cartesian coordinate system into a data point in a homogeneous coordinate system. A confidence level can be assigned to each of the data points based on the proximity of the margin points used to estimate the resulting margin lines as well as the length of the respective margin lines, as was done for the identification of horizontal vanishing points. A set of data points among the data points having a confidence level greater than a predetermined threshold is grouped in a priority sample matrix. 
In addition, the data points in the priority sample matrix are clustered into a number of sample groups. In one embodiment, each of the sample groups comprises two or more data points. In addition, a group confidence value can be assigned to each sample group on the basis of the confidence level assigned to each data point in the sample group. Sample groups of data points can be selected iteratively from the priority sample matrix for line fitting. In one embodiment, the iteration can begin with the sample group having the highest confidence value in the priority sample matrix. Line fitting can be performed for the first sample group, resulting in a first fitted line. Line fitting can then be performed for each other sample group, resulting in other fitted lines. A set of data points that are positioned within a predetermined distance threshold of the first fitted line can be determined based on the first and the other fitted lines. First and second vertical vanishing point candidates may be estimated from the vertical lines corresponding to the determined set of data points. In one embodiment, the first and second vertical vanishing point candidates can be estimated using different approximation methods such as a least squares method, a weighted least squares method, and an adaptive least squares method. Other methods of approximation can also be used. The proximity of each candidate vertical vanishing point to the resulting vertical text direction after projective correction can be compared. The vertical vanishing point candidate that is closest to the vertical text direction of the document image after projective correction can be selected. If the number of margin lines detected is relatively small (less than 5, for example), it is also possible to calculate the vanishing point directly using the weighted vertical vanishing point identification method. 
With this method, each of said estimated vertical margin lines is transformed from the Cartesian coordinate system into a data point in a homogeneous coordinate system. A confidence level can be assigned to each of the data points as mentioned above. After that, the weighted least squares method can be used to fit the line that corresponds to the vertical vanishing point. Figure 5 depicts an exemplary process 500 for identifying the vertical vanishing point using the connected component analysis, according to one embodiment. The process 500 may be used in cases where vertical margin lines may not be available due to the absence of margins. The vertical vanishing point can be identified by using the text stroke characteristics of pixel blobs, the pixel blob being the building unit of a text character. In step 502, vertical text stroke characteristics of pixel blobs can be identified. Fig. 15 illustrates an exemplary image showing the identification of text stroke characteristics of a character. A portion of text identified by a circle 1502 is shown on the right side of the figure. The vertical text stroke characteristics 1504 of the letters "in the" are identified and shown. In step 504, a set of pixel blobs can be identified whose text stroke characteristics conform to one or more defined criteria. In one embodiment, a pixel blob may be selected if the pixel blob satisfies one or more of the following criteria: eccentricity of the pixel blob greater than 0.97, not proximal to the edge of the image, angle of the text stroke between 70° and 110°, and pixel blob area within [0.3, 5] * area_m. Eccentricity can be used to indicate how close the pixel blob is to a circular shape. Since the eccentricity of a circle is zero, the smaller the eccentricity value, the more circular the pixel blob. If the eccentricity of a pixel blob is greater than 0.97, the pixel blob may be a stretched blob that looks like a line segment and may therefore be indicative of a vertical text stroke. 
In one embodiment, the eccentricity of the pixel blob can be found by identifying the ellipse surrounding the pixel blob and then computing it according to the following formula: e = sqrt(1 - (b/a)^2), where a and b represent the semi-major axis and the semi-minor axis of the ellipse. For languages such as Chinese and Russian, an optional preprocessing procedure such as edge detection or mathematical morphological filtering may be used to increase the eccentricity characteristics of the pixel blobs. Pixel blobs having an eccentricity of 0.97 or less can be filtered out using an appropriate filter. Pixel blobs near the edge of the image cannot be used for the estimation. In one embodiment, proximity filtering can be used to eliminate pixel blobs that intersect the image edges. Similarly, in one embodiment, angle filtering may be performed to filter out pixel blobs having text strokes that are not between 70 degrees and 110 degrees. Pixel blobs having an area in the range [0.3, 5] * area_m can be chosen. To identify blobs in such a range, a robust method can be used to estimate the median area area_m of the pixel blobs that remain after filtering on the aforementioned criteria. Pixel blobs with area values in the range [0.3, 5] * area_m are used to estimate vertical vanishing points. Figure 16 illustrates an exemplary image showing the blobs selectively extracted after identifying text stroke characteristics. The selected pixel blobs are used to estimate vertical text blob lines. The vertical lines are estimated in step 506. The vertical lines are estimated using a line function that corresponds to the direction of the pixel blob. Fig. 17 illustrates an exemplary image showing vertical text blob lines for selected pixel blobs. In step 508, the vertical vanishing point can be identified using the estimated vertical lines. In one embodiment, the vertical vanishing point can be determined using the modified RANSAC algorithm as described above. 
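The eccentricity formula and the blob filters of step 504 can be sketched as follows; the blob representation and field names are hypothetical, and only the thresholds come from the text.

```python
import math

def eccentricity(a, b):
    """Eccentricity of the ellipse fitted around a pixel blob
    (a: semi-major axis, b: semi-minor axis); 0 for a circle,
    approaching 1 for a stroke-like, elongated blob."""
    return math.sqrt(1.0 - (b / a) ** 2)

def select_stroke_blobs(blobs, median_area):
    """Keep blobs satisfying the criteria of step 504: elongated,
    away from the image edge, near-vertical, and of plausible area."""
    return [blob for blob in blobs
            if eccentricity(blob["a"], blob["b"]) > 0.97
            and not blob["near_edge"]
            and 70.0 <= blob["angle"] <= 110.0
            and 0.3 * median_area <= blob["area"] <= 5.0 * median_area]
```

A blob whose surrounding ellipse is ten times longer than it is wide has eccentricity sqrt(0.99), about 0.995, and passes the 0.97 threshold, whereas a circular blob (a = b) has eccentricity 0 and is rejected.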
Figure 18 shows an exemplary image showing the vertical text blob lines selected as a result of applying the modified RANSAC algorithm. A brief explanation summarizing the application of the modified RANSAC algorithm to vertical lines is provided below. Each of said estimated vertical text blob lines is defined as a line in a Cartesian coordinate system. In addition, each of said estimated vertical text blob lines is transformed from the Cartesian coordinate system into a data point in a homogeneous coordinate system. A confidence level can be assigned to each of the data points. The confidence level may be based on at least the eccentricity of the shape of the pixel blob used to estimate the vertical text line concerned. In addition, the modified RANSAC method is applied as described above in connection with the preceding figures to determine the vertical vanishing point. The projective correction algorithm may be implemented as a set of computer instructions which, when loaded onto a computing device, provide a machine for implementing the functions described herein. These computer program instructions may also be stored in a non-transitory computer-readable memory that can instruct a computer or other programmable data processing apparatus to operate in the described manner. The projective correction algorithm can also be implemented as hardware or as a combination of hardware and software that can be implemented in or with computer-controlled systems. Those skilled in the art will realize that a computer-controlled system includes an operating system and various server/computer-related support software. The projective correction algorithm as described herein may be deployed by an organization and/or an independent provider associated with the organization. 
The projective correction algorithm may be a stand-alone application residing on a user device, or a modular application (e.g., an extension module or plugin) that may be integrated with other applications such as image processing applications, OCR applications and the like. For example, the stand-alone application may reside on user devices such as a personal computer, a laptop, a mini-laptop, a tablet computer, a smartphone, a digital camera, a video camera, a mobile communication apparatus, a personal digital assistant, a scanner, a multifunction device or any other device capable of obtaining document images and having a processor for performing the operations described in this document. In another implementation, part of the projective correction algorithm may be executed by a user device (for example, the user's camera) and the other part of the projective correction algorithm may be executed by a processing device (e.g., the user's personal computer) coupled to the user device. In this case, the processing device can perform the more computationally expensive tasks. The projective correction algorithm may also be implemented as a server application residing on a server (e.g., an OCR server) accessible from user devices through a network. The projective correction algorithm can also be implemented as a network application having modules implemented in multiple networked devices. To summarize, this disclosure provides various embodiments of methods for projectively correcting images distorted by perspective, for example camera-based document images, which have at least one of the following technical contributions: - use of clean points to estimate the horizontal vanishing point. In general, it is preferred to use one of the pixels on the baseline of the bounding rectangle as position-determining pixel, since these baselines are mostly aligned for multiple successive characters in a text portion. 
Of these, the clean points are preferred since they are a by-product of the standard connected component analysis and therefore no additional processing step is required to obtain them for each pixel blob.
- A clean point selection procedure is proposed to select the clean points that can be used for estimating lines of text. Embodiments that eliminate confusing clean points and group the remaining clean points by clustering or merging have been disclosed. In addition, the result of clustering clean points is already the estimated baseline.
- The left end point and the right end point of the text part baselines are used as margin points for margin line estimation. An algorithm for clustering left and right end points is proposed to estimate margin lines.
- An adaptation of the conventional RANSAC algorithm, which could be called priority RANSAC, is proposed to identify inliers in the estimation of vanishing points, the conventional algorithm being improved by taking into account prior knowledge, for example confidence values or confidence levels.
- A vanishing point selection procedure is adopted to select from a number of candidate vanishing points that can be determined in different ways.
- A weighted line estimate is proposed for the estimation of horizontal vanishing points using confidence levels, and an adaptive weighted line estimate is proposed for the estimation of vertical vanishing points.
- A perpendicular deflection least squares method and a changeable x-y weighted least squares method are proposed to compute the vertical margin lines.
- An estimation of vertical vanishing points based on blob analysis is proposed, in particular taking into consideration the vertical line characteristics of pixel blobs.
- A page analysis is embedded in the processing chain and only the textual information is used for the projective correction. Embodiments in which steps are taken to eliminate or separate photos before performing the projective correction are provided. 
- A complete processing chain to solve the problem of projective correction is proposed, in which user intervention can be avoided. A projective correction method is proposed which includes elimination steps at different levels, namely the clean point, baseline and candidate vanishing point levels, to collectively improve the results of the projective correction.
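As a closing illustration, the projective correction step itself can be sketched using the standard construction that maps the vanishing line (the cross product of the two vanishing points in homogeneous coordinates) to the line at infinity. This is the textbook affine-rectification homography, offered as one possible realization under our own assumptions, not as the patented implementation.

```python
def cross(u, v):
    """Cross product of two homogeneous 3-vectors."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def rectifying_homography(vp_h, vp_v):
    """Homography sending the vanishing line l = vp_h x vp_v to the line
    at infinity, so that horizontal and vertical text directions become
    parallel again after warping (requires l[2] != 0)."""
    l = cross(vp_h, vp_v)
    return [[1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [l[0] / l[2], l[1] / l[2], 1.0]]

def apply_homography(H, p):
    """Apply H to a homogeneous point p."""
    return [sum(H[i][j] * p[j] for j in range(3)) for i in range(3)]
```

After applying the homography, both vanishing points land on the line at infinity (their third homogeneous coordinate becomes zero), which is exactly the condition for the corrected text lines and margins to be parallel.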
Claims (16) [1] A method of projectively correcting an image containing at least a text portion that is distorted by perspective, the method comprising the steps of: image binarization, wherein said image is binarized; connected component analysis, wherein blobs of pixels are detected in said at least one text portion of said binarized image and wherein, for each of said pixel blobs, a position-determining pixel is selected on a baseline of the pixel blob, said position-determining pixel defining the position of the pixel blob in the binarized image; determining horizontal vanishing points, comprising the steps of: estimating text baselines using said position-determining pixels of said pixel blobs, identifying candidate horizontal vanishing points from said estimated text baselines, and determining a horizontal vanishing point of said at least one text portion by means of said horizontal vanishing point candidates; determining vertical vanishing points, wherein a vertical vanishing point is determined for said at least one text portion based on vertical characteristics thereof; and projective correction, wherein said perspective in said image is corrected on the basis of said horizontal and vertical vanishing points; wherein said determination of horizontal vanishing points comprises a first elimination step at said position-determining pixels, a second elimination step at the text baselines and a third elimination step at the candidate horizontal vanishing points. [2] The method of claim 1, wherein said position-determining pixels are clean points of said pixel blobs. [3] The method according to claim 2, wherein said first elimination step comprises the step of detecting confusing clean points that are misaligned with respect to clean points in the vicinity of the clean point under consideration, and wherein said confusing clean points are neglected for said estimation of text baselines. 
[4] The method of claim 3, wherein said confusing proper points are detected by the steps of: determining the width and height of the pixel blobs; determining average values for the width and height of the pixel blobs; and detecting said confusing proper points as proper points belonging to pixel blobs of which at least one of the width and height differs by a predetermined measure from said calculated average values.

[5] The method of claim 2, wherein said text baseline estimation step comprises a step of clustering proper points into groups of proper points, wherein said groups of proper points fulfil at least one of the following conditions: - a point-to-point distance between the proper points of the group is less than a first distance threshold, - a point-to-line distance between each proper point of the group and a line formed by the proper points of the group is less than a second distance threshold, - an angle of the line formed by the proper points of the group to the horizontal is less than a maximum angle, and - the group of proper points contains a minimum number of proper points; and wherein said text baselines are estimated based on said groups of proper points.

[6] The method of claim 5, wherein said first distance threshold, said second distance threshold, said maximum angle and said minimum number of proper points are adaptively adjusted based on the content of the image.

[7] The method of claim 5, wherein said text baseline estimation step further comprises a step of merging groups of proper points, in which groups of proper points on both sides of a disregarded proper point are merged into a larger group of proper points.

[8] The method of claim 1, wherein the second elimination step comprises the steps of: assigning confidence levels to said text baselines, and eliminating text baselines based on said confidence levels.
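The clustering conditions of claim 5 can be sketched as a greedy left-to-right grouping of proper points. All threshold defaults and the incremental least-squares line fit below are illustrative assumptions, not values taken from the patent.

```python
import math

def cluster_proper_points(points, d_point=30.0, d_line=3.0,
                          max_angle_deg=30.0, min_points=3):
    """Group proper points (x, y) into candidate baselines by scanning
    left to right and testing the four conditions of the claim:
    point-to-point distance, point-to-line distance, line angle to the
    horizontal, and minimum group size."""
    points = sorted(points)
    groups, current = [], []
    for p in points:
        if not current:
            current.append(p)
            continue
        last = current[-1]
        ok = math.hypot(p[0] - last[0], p[1] - last[1]) < d_point
        if ok and len(current) >= 2:
            # Least-squares line through the current group, then check the
            # candidate's point-to-line residual and the line's angle.
            n = len(current)
            mx = sum(q[0] for q in current) / n
            my = sum(q[1] for q in current) / n
            den = sum((x - mx) ** 2 for x, _ in current)
            slope = (sum((x - mx) * (y - my) for x, y in current) / den
                     if den else 0.0)
            ok = (abs(my + slope * (p[0] - mx) - p[1]) < d_line
                  and abs(math.degrees(math.atan(slope))) < max_angle_deg)
        if ok:
            current.append(p)
        else:
            if len(current) >= min_points:
                groups.append(current)
            current = [p]
    if len(current) >= min_points:
        groups.append(current)
    return groups
```

Claim 6's adaptive thresholds would replace the fixed keyword defaults, e.g. scaling `d_point` with the average blob width computed for the page.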
[9] The method of claim 8, wherein said confidence levels are determined based on at least the length of the text baseline concerned and the proximity between the group of proper points used to estimate the text baseline and the resulting text baseline.

[10] The method of claim 8, wherein said eliminating of text baselines is performed by means of a RANSAC algorithm in which said confidence levels are taken into account.

[11] The method of claim 1, wherein the third elimination step comprises: performing a projective correction based on each identified candidate horizontal vanishing point; comparing, for each candidate horizontal vanishing point, the proximity of the resulting text direction to the horizontal after projective correction; and selecting the candidate horizontal vanishing point for which the text direction of the image document is closest to the horizontal after projective correction.

[12] The method of claim 1, wherein a first and a second candidate horizontal vanishing point are estimated from said text baselines after said second elimination step, and wherein said estimation of said first and second candidate vanishing points uses different approximation methods selected from the group consisting of: a least squares method, a weighted least squares method and an adaptive least squares method.

[13] The method of claim 1, wherein a step of separating text and illustrations is performed after said image binarization and prior to said connected component analysis, and only the textual information is kept in said binarized image.
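The weighted least squares method named in claim 12, with the baseline confidence levels of claims 8 and 9 serving as weights, can be sketched as the weighted least-squares intersection of the baselines. Representing each baseline in normalized form a·x + b·y + c = 0 (with a² + b² = 1) is an assumption of this sketch.

```python
def vanishing_point_wls(lines, weights=None):
    """Weighted least-squares intersection of baselines given as (a, b, c)
    with a*x + b*y + c = 0 and a^2 + b^2 = 1. Minimizes the weighted sum of
    squared point-to-line distances; `weights` plays the role of the
    per-baseline confidence levels."""
    if weights is None:
        weights = [1.0] * len(lines)
    Saa = Sab = Sbb = Sac = Sbc = 0.0
    for (a, b, c), w in zip(lines, weights):
        Saa += w * a * a
        Sab += w * a * b
        Sbb += w * b * b
        Sac += w * a * c
        Sbc += w * b * c
    # Normal equations: Saa*x + Sab*y + Sac = 0, Sab*x + Sbb*y + Sbc = 0.
    det = Saa * Sbb - Sab * Sab
    if abs(det) < 1e-12:
        return None  # (nearly) parallel lines: vanishing point at infinity
    x = (Sab * Sbc - Sbb * Sac) / det
    y = (Sab * Sac - Saa * Sbc) / det
    return x, y
```

A RANSAC wrapper in the spirit of claim 10 would repeatedly call this on random subsets of baselines and keep the solution supported by the most high-confidence inliers.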
[14] A system for projectively correcting an image containing at least one text portion that is distorted by perspective, the system comprising at least one processor and associated storage containing a program executable by means of the at least one processor and comprising: first software code portions configured for image binarization, which, when executed, binarize said image; second software code portions configured for connected component analysis, which, when executed, detect blobs of pixels in said at least one text portion of said binarized image and select, for each of said pixel blobs, a position-determining pixel on a baseline of the pixel blob, said position-determining pixel defining the position of the pixel blob in the binarized image; third software code portions configured for the determination of horizontal vanishing points, which, when executed, perform the steps of: estimating text baselines by means of said position-determining pixels of said pixel blobs, identifying candidate horizontal vanishing points from said estimated text baselines, and determining a horizontal vanishing point of said at least one text portion by means of said candidate horizontal vanishing points; fourth software code portions configured for the determination of vertical vanishing points, which, when executed, determine a vertical vanishing point for said at least one text portion based on vertical characteristics thereof; and fifth software code portions for the projective correction, which, when executed, correct said perspective in said image based on said horizontal and vertical vanishing points; wherein said third software code portions, when executed, perform a first elimination step at the level of said position-determining pixels, a second elimination step at the level of the text baselines and a third elimination step at the level of the candidate horizontal vanishing points.
[15] The system of claim 14, comprised in one of the following: a personal computer, a laptop computer, a notebook computer, a portable mini-computer, a tablet computer, a smartphone, a digital camera, a video camera, a mobile communication device, a personal digital assistant, a scanning digitizer, a multifunction device.

[16] A non-transitory storage medium on which is stored a computer program product comprising software code portions in a format executable on a computing device and configured to perform the following steps when executed on said computing device: image binarization, wherein said image is binarized; connected component analysis, wherein blobs of pixels are detected in said at least one text portion of said binarized image and wherein, for each of said pixel blobs, a position-determining pixel is selected on a baseline of the pixel blob, said position-determining pixel defining the position of the pixel blob in the binarized image; determining horizontal vanishing points, comprising the steps of: estimating text baselines by means of said position-determining pixels of said pixel blobs, identifying candidate horizontal vanishing points from said estimated text baselines, and determining a horizontal vanishing point of said at least one text portion by means of said candidate horizontal vanishing points; determining vertical vanishing points, wherein a vertical vanishing point is determined for said at least one text portion based on vertical characteristics thereof; and projective correction, wherein said perspective in said image is corrected on the basis of said horizontal and vertical vanishing points; wherein said determination of horizontal vanishing points comprises a first elimination step at the level of said position-determining pixels, a second elimination step at the level of the text baselines and a third elimination step at the level of the candidate horizontal vanishing points.
Patent family:

Publication number | Publication date
BE1022636A1 | 2016-06-22
US8811751B1 | 2014-08-19
Priority:

Application number | Filing date | Patent title
US14/136,695 | 2013-12-20 | Method and system for correcting projective distortions with elimination steps on multiple levels (US8811751B1)